FlashPCA2: principal component analysis of Biobank-scale genotype datasets
نویسندگان
چکیده
منابع مشابه
FlashPCA2: principal component analysis of Biobank-scale genotype datasets
Motivation Principal component analysis (PCA) is a crucial step in quality control of genomic data and a common approach for understanding population genetic structure. With the advent of large genotyping studies involving hundreds of thousands of individuals, standard approaches are no longer feasible. However, when the full decomposition is not required, substantial computational savings can ...
متن کاملPrincipal Component Analysis of Process Datasets with Missing Values
Datasets with missing values arising from causes such as sensor failure, inconsistent sampling rates, and merging data from different systems are common in the process industry. Methods for handling missing data typically operate during data pre-processing, but can also occur during model building. This article considers missing data within the context of principal component analysis (PCA), whi...
متن کاملCompression of Breast Cancer Images By Principal Component Analysis
The principle of dimensionality reduction with PCA is the representation of the dataset ‘X’in terms of eigenvectors ei ∈ RN of its covariance matrix. The eigenvectors oriented in the direction with the maximum variance of X in RN carry the most relevant information of X. These eigenvectors are called principal components [8]. Ass...
متن کاملCompression of Breast Cancer Images By Principal Component Analysis
The principle of dimensionality reduction with PCA is the representation of the dataset ‘X’in terms of eigenvectors ei ∈ RN of its covariance matrix. The eigenvectors oriented in the direction with the maximum variance of X in RN carry the most relevant information of X. These eigenvectors are called principal components [8]. Ass...
متن کاملPrincipal Component Projection Without Principal Component Analysis
We show how to efficiently project a vector onto the top principal components of a matrix, without explicitly computing these components. Specifically, we introduce an iterative algorithm that provably computes the projection using few calls to any black-box routine for ridge regression. By avoiding explicit principal component analysis (PCA), our algorithm is the first with no runtime dependen...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Bioinformatics
سال: 2017
ISSN: 1367-4803,1460-2059
DOI: 10.1093/bioinformatics/btx299